Colorizing the Prokudin-Gorskii Photo Collection
Name: Riyaz Panjwani
Overview
In this project, we work with a collection of greyscale images by the famous photographer Sergei Mikhailovich Prokudin-Gorskii. Each image contains three channel exposures (R, G, and B) that may have undergone transformations such as translation or rotation relative to one another. The goal of the assignment is to overlay the three channels in a way that minimizes artifacts and produces a color image.
Approach
We applied several image processing techniques to obtain a well-aligned image composition. First, we started with template matching, exhaustively searching a [-15, 15] pixel window of displacements and scoring each one with the Sum of Squared Differences (SSD) distance and Normalized Cross-Correlation (NCC). The displacement with the lowest SSD or highest NCC was stored as the best translation for image alignment. Second, since a few of the images were high-resolution, the brute-force solution would be expensive in resources; hence we implemented a faster search procedure using an image pyramid, which downsamples and processes the images and then upsamples the corresponding layers to obtain the final offset. Finally, we enhanced our aligned results using additional techniques such as auto-cropping, auto-contrast, auto white balance, better feature mapping, aligning, and processing data from other sources.
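The exhaustive search described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual code: the function names (`ssd`, `ncc`, `align_exhaustive`) are hypothetical, and `np.roll` is assumed as the shifting operation, which wraps pixels around the border instead of discarding them.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means more similar."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation: higher means more similar."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def align_exhaustive(ref, base, radius=15, metric="ncc"):
    """Return the (dy, dx) shift of `ref` that best matches `base`.

    Every shift in [-radius, radius] x [-radius, radius] is scored.
    Assumption: np.roll (circular shift) stands in for a true translation.
    """
    ref = ref.astype(np.float64)    # avoid uint8 overflow in the metrics
    base = base.astype(np.float64)
    best_score, best_shift = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(ref, (dy, dx), axis=(0, 1))
            if metric == "ssd":
                score = -ssd(shifted, base)  # negate so larger is always better
            else:
                score = ncc(shifted, base)
            if best_score is None or score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With `radius=15` this scores 31 x 31 = 961 candidate shifts per channel, which is cheap for small images but grows quickly with image size.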
Explanation
Before applying any of the image processing methods, we pre-processed the image by splitting it into the corresponding RGB channels. We assumed that the channels divide the image into equal parts. Also, since these channels might already have some borders, we shaved off 3% of each dimension to obtain a cleaner image. After preprocessing the individual channels, we computed both SSD and NCC over a moving [-15, 15] search window. We had a base image (green channel) and two reference images (red and blue channels). The function tracked the optimal values of both metrics and stored the best translation offsets for each reference image with respect to the base image. Once we obtained the offsets, we translated the reference images and aligned them with the base image. In almost all the images, we found NCC to be more robust than SSD. For large images we used the image pyramid method, wherein we downsampled the image by a factor of 2 and then upsampled it after processing.
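The preprocessing and pyramid steps above might look roughly like the sketch below. Several details are assumptions made for illustration: the three channels are taken to be stacked vertically, the 3% trim is applied on every side, downsampling is plain striding with no blurring, and the names `split_and_crop`, `best_shift`, and `pyramid_align` are hypothetical.

```python
import numpy as np

def split_and_crop(plate, frac=0.03):
    """Split a vertically stacked plate into three equal-height channels
    (top to bottom) and trim `frac` of each dimension from every side."""
    h = plate.shape[0] // 3
    w = plate.shape[1]
    mh, mw = int(round(h * frac)), int(round(w * frac))
    channels = [plate[i * h:(i + 1) * h].astype(np.float64) for i in range(3)]
    return [c[mh:h - mh, mw:w - mw] for c in channels]

def best_shift(ref, base, center, radius):
    """Exhaustive NCC search in a window of `radius` around `center`."""
    b = base - base.mean()
    best, arg = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            s = np.roll(ref, (dy, dx), axis=(0, 1))  # circular shift (assumption)
            s = s - s.mean()
            score = np.sum(s * b) / (np.linalg.norm(s) * np.linalg.norm(b) + 1e-12)
            if score > best:
                best, arg = score, (dy, dx)
    return arg

def pyramid_align(ref, base, min_size=64, radius=15, refine=2):
    """Coarse-to-fine alignment: solve on a half-size image first, then
    double that offset and refine it within a small window."""
    if min(ref.shape) <= min_size:
        return best_shift(ref, base, (0, 0), radius)
    coarse = pyramid_align(ref[::2, ::2], base[::2, ::2], min_size, radius, refine)
    center = (2 * coarse[0], 2 * coarse[1])
    return best_shift(ref, base, center, refine)
```

Because each level only refines within `refine` pixels of the doubled coarse estimate, the full [-15, 15] search runs only on the smallest image, yet offsets larger than 15 pixels at full resolution remain reachable.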
Results
[Figure panels: NCC, SSD, Raw]
NCC Loss Offsets for Images
Image Name (Channel) | Offset X (px) | Offset Y (px) |
cathedral.jpeg (G) | 5 | 2 |
cathedral.jpeg (R) | 12 | 3 |
emir.tiff (G) | 49 | 24 |
emir.tiff (R) | 104 | 42 |
harvesters.tiff (G) | 61 | 16 |
harvesters.tiff (R) | 124 | 13 |
icon.tiff (G) | 41 | 17 |
icon.tiff (R) | 89 | 23 |
lady.tiff (G) | 59 | -5 |
lady.tiff (R) | 119 | -10 |
self_portrait.tiff (G) | 82 | -2 |
self_portrait.tiff (R) | 127 | -8 |
three_generations.tiff (G) | 55 | 12 |
three_generations.tiff (R) | 112 | 10 |
train.tiff (G) | 44 | 2 |
train.tiff (R) | 88 | 30 |
turkmen.tiff (G) | 56 | 18 |
turkmen.tiff (R) | 115 | 26 |
village.tiff (G) | 66 | 13 |
village.tiff (R) | 127 | 24 |
SSD Loss Offsets for Images
Image Name (Channel) | Offset X (px) | Offset Y (px) |
cathedral.jpeg (G) | 5 | 2 |
cathedral.jpeg (R) | 12 | 3 |
emir.tiff (G) | 49 | 24 |
emir.tiff (R) | 104 | 42 |
harvesters.tiff (G) | 71 | 39 |
harvesters.tiff (R) | 124 | 13 |
icon.tiff (G) | 41 | 17 |
icon.tiff (R) | 89 | 23 |
lady.tiff (G) | 59 | -5 |
lady.tiff (R) | 117 | -10 |
self_portrait.tiff (G) | 82 | -2 |
self_portrait.tiff (R) | 127 | -8 |
three_generations.tiff (G) | 55 | 12 |
three_generations.tiff (R) | 112 | 10 |
train.tiff (G) | 44 | 2 |
train.tiff (R) | 88 | 30 |
turkmen.tiff (G) | 56 | 18 |
turkmen.tiff (R) | 115 | 26 |
village.tiff (G) | 66 | 13 |
village.tiff (R) | 127 | 24 |
Bells & Whistles
Auto-Crop
For this use case, we extracted the top-left and bottom-right portions of the aligned color image. We observed that, in case of misalignment, there is a black border around the entire image. We then applied a Sobel filter to find the edges of that border, which gave us the offsets needed to crop the image. Finally, we applied image transformation techniques to obtain the final image.
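One way to turn Sobel responses into crop offsets is sketched below. It is an illustration under assumptions, not the method used in the project: the border is taken to lie within the outer 10% of each side, the strongest averaged edge in that strip is treated as the border boundary, and the names `sobel`, `margin`, and `auto_crop` are hypothetical.

```python
import numpy as np

def sobel(gray):
    """3x3 Sobel responses over the valid region (output is 2 px smaller)."""
    g = gray.astype(np.float64)
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    return gx, gy

def margin(profile, strip):
    """Pixels to trim from one side: just past the strongest edge found
    within the first `strip` entries of the edge-strength profile."""
    window = profile[:strip]
    if window.max() <= 0:      # no edge at all in the strip: trim nothing
        return 0
    return int(np.argmax(window)) + 2

def auto_crop(img, frac=0.1):
    """Crop borders by locating the strongest edge near each image side."""
    gray = img.mean(axis=2) if img.ndim == 3 else img
    h, w = gray.shape
    gx, gy = sobel(gray)
    rows = np.abs(gy).mean(axis=1)   # horizontal-edge strength per row
    cols = np.abs(gx).mean(axis=0)   # vertical-edge strength per column
    sr, sc = max(1, int(frac * h)), max(1, int(frac * w))
    top, bottom = margin(rows, sr), margin(rows[::-1], sr)
    left, right = margin(cols, sc), margin(cols[::-1], sc)
    return img[top:h - bottom, left:w - right]
```

On real scans the strongest edge in the strip may belong to image content and not to the border, so a practical version would likely need a threshold or a check across several candidate edges.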
RESULTS
[Figure panels: NCC, Cropped]
Auto-WB Adjusted
To obtain the white-balanced image, we worked on the cropped image from above to minimize image artifacts. We then computed the mean of each channel and normalized the values across channels. The final image had a more balanced histogram distribution.
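Normalizing by per-channel means resembles the common gray-world white balance, sketched below. Treating the described step as gray-world is an assumption, as are the [0, 1] float value range and the hypothetical function name `gray_world`.

```python
import numpy as np

def gray_world(img):
    """Gray-world white balance for an H x W x 3 float image in [0, 1].

    Each channel is scaled so that its mean equals the mean of the
    three channel means.
    """
    img = img.astype(np.float64)
    means = img.reshape(-1, img.shape[2]).mean(axis=0)  # one mean per channel
    gains = means.mean() / np.maximum(means, 1e-12)     # guard against division by zero
    return np.clip(img * gains, 0.0, 1.0)
```

The clip at the end keeps values in range, at the cost of slightly shifting the channel means when bright pixels saturate.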
RESULTS
[Figure panels: Cropped, WB Adjusted]
Auto-Contrast
For auto-contrast, we worked on the cropped image. We calculated the grayscale histogram of the transformed image and then located the points to clip at the 1% level on the left and right ends. Finally, we used cv2.convertScaleAbs to obtain the final contrast-adjusted image.
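The histogram clipping can be sketched with NumPy alone. The last two lines imitate what cv2.convertScaleAbs does (scale, add an offset, take the absolute value, saturate to 8 bits) so that the example has no OpenCV dependency; the function name `auto_contrast` and the exact way the clip points are chosen are assumptions.

```python
import numpy as np

def auto_contrast(img, clip_percent=1.0):
    """Stretch a uint8 image so the clipped histogram range spans 0..255."""
    gray = img.mean(axis=2) if img.ndim == 3 else img
    hist = np.bincount(gray.astype(np.uint8).ravel(), minlength=256)
    cdf = np.cumsum(hist)
    total = cdf[-1]
    clip = total * clip_percent / 100.0
    # First grey level whose cumulative count exceeds the lower clip amount
    lo = int(np.searchsorted(cdf, clip, side="right"))
    # First grey level whose cumulative count reaches the upper clip point
    hi = int(np.searchsorted(cdf, total - clip, side="left"))
    if hi <= lo:               # degenerate histogram: leave the image as is
        return img.copy()
    alpha = 255.0 / (hi - lo)  # gain
    beta = -lo * alpha         # offset
    # NumPy stand-in for cv2.convertScaleAbs(img, alpha=alpha, beta=beta)
    out = np.abs(img.astype(np.float64) * alpha + beta)
    return np.clip(np.round(out), 0, 255).astype(np.uint8)
```

Note that the absolute value means grey levels below `lo` are reflected to small positive values and not clamped to zero, which mirrors the behaviour of the OpenCV call.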
RESULTS
[Figure panels: Cropped, Contrast Adjusted]
Feature-Based Alignment
This technique is often called “feature-based” image alignment because a sparse set of features is detected in one image and matched with the features in the other image. A transformation that warps one image onto the other is then calculated from these matched features. We split the image into three parts and then registered each channel with respect to a reference channel. Finally, all the channels were concatenated into one RGB color image.
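The step of computing a transformation from matched features can be illustrated with a least-squares affine fit. This sketch assumes the matched point pairs are already available from whatever detector and matcher is used, and it omits outlier rejection (such as RANSAC) that a practical pipeline would normally include. The source does not say which transformation model was used, so the affine model and the name `estimate_affine` are assumptions.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine matrix A such that dst ~= A @ [x, y, 1].

    src_pts, dst_pts: sequences of N matched (x, y) points, N >= 3.
    """
    src = np.asarray(src_pts, dtype=np.float64)
    dst = np.asarray(dst_pts, dtype=np.float64)
    X = np.hstack([src, np.ones((len(src), 1))])   # N x 3 design matrix
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)  # 3 x 2 solution
    return sol.T                                   # 2 x 3 affine matrix
```

The last column of the returned matrix is the translation, and the left 2 x 2 block carries rotation, scale, and shear.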
RESULTS
[Figure panels: Raw, Feature-based alignment]
4D Search
In this method, we searched a grid of transformations over translation (X and Y), scaling (0.5X to 2X in increments of 0.5), and rotation (-1.5 to 2 degrees in increments of 0.5 degrees). While the complexity increased significantly, we saw only minor enhancements in the aligned image. As with the images above, we then applied auto_crop to remove the unwanted borders after aligning all the channels.
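A brute-force search over scale, rotation, and translation could be organized as in the sketch below. The warp uses nearest-neighbor sampling about the image center and scores candidates with NCC; both choices, the rotation sign convention, and the names `warp` and `search_4d` are assumptions made for illustration.

```python
import numpy as np
from itertools import product

def warp(img, scale, angle_deg, dy, dx):
    """Scale and rotate about the image center, then translate.
    Nearest-neighbor sampling; pixels mapped from outside become 0."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # Inverse mapping: find the source location of every output pixel
    y0, x0 = ys - dy - cy, xs - dx - cx
    sy = (np.cos(t) * y0 - np.sin(t) * x0) / scale + cy
    sx = (np.sin(t) * y0 + np.cos(t) * x0) / scale + cx
    iy, ix = np.rint(sy).astype(int), np.rint(sx).astype(int)
    valid = (iy >= 0) & (iy < h) & (ix >= 0) & (ix < w)
    out = np.zeros((h, w), dtype=np.float64)
    out[valid] = img[iy[valid], ix[valid]]
    return out

def search_4d(ref, base, scales, angles, shifts):
    """Return the (scale, angle, dy, dx) whose warp of `ref` has the
    highest NCC with `base`. Ties keep the first candidate visited."""
    b = base - base.mean()
    best, best_params = -np.inf, None
    for s, a, dy, dx in product(scales, angles, shifts, shifts):
        c = warp(ref, s, a, dy, dx)
        c = c - c.mean()
        score = np.sum(c * b) / (np.linalg.norm(c) * np.linalg.norm(b) + 1e-12)
        if score > best:
            best, best_params = score, (s, a, dy, dx)
    return best_params
```

With 4 scales, 8 angles, and a [-15, 15] translation window, this is 4 x 8 x 961 = 30,752 warps per channel, which shows why the complexity grows so sharply compared with a translation-only search.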
RESULTS
[Figure panels: Raw, 4D Search]
Interpolation
In this technique, we used interpolation methods to align the channels. We tried various methods, but found nearest-neighbor and bilinear interpolation to be the most effective. There were minor enhancements in image quality for large images. We used PyTorch to implement this technique.
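To show what bilinear interpolation contributes, the sketch below shifts a channel by a fractional number of pixels. It is written in NumPy for self-containment and is not the PyTorch code used in the project; how interpolation was actually applied there is not stated, so the sub-pixel shift use case and the names `bilinear_sample` and `shift_subpixel` are assumptions.

```python
import numpy as np

def bilinear_sample(img, ys, xs):
    """Sample a 2D image at fractional (ys, xs) coordinates.
    Coordinates outside the image are clamped to the nearest edge."""
    h, w = img.shape
    ys = np.clip(ys, 0, h - 1)
    xs = np.clip(xs, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy, wx = ys - y0, xs - x0          # fractional parts act as weights
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bot = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bot

def shift_subpixel(img, dy, dx):
    """Translate an image by a possibly fractional (dy, dx) offset."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    return bilinear_sample(img.astype(np.float64), ys - dy, xs - dx)
```

Nearest-neighbor interpolation corresponds to rounding the sample coordinates to integers, so it can only represent whole-pixel shifts, whereas the bilinear version blends the four surrounding pixels.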
RESULTS
[Figure panels: Raw, Interpolation]